Dense Associative Memory is Robust to Adversarial Inputs
Deep neural networks (DNN) trained in a supervised way suffer from two known
problems. First, the minima of the objective function used in learning
correspond to data points (also known as rubbish examples or fooling images)
that lack semantic similarity with the training data. Second, a clean input can
be changed by a small perturbation, often imperceptible to human vision,
so that the resulting deformed input is misclassified by the network. These
findings emphasize the differences between the ways DNN and humans classify
patterns, and raise a question of designing learning algorithms that more
accurately mimic human perception compared to the existing methods.
Our paper examines these questions within the framework of Dense Associative
Memory (DAM) models. These models are defined by the energy function, with
higher order (higher than quadratic) interactions between the neurons. We show
that in the limit when the power of the interaction vertex in the energy
function is sufficiently large, these models have the following three
properties. First, the minima of the objective function are free from rubbish
images, so that each minimum is a semantically meaningful pattern. Second,
artificial patterns poised precisely at the decision boundary look ambiguous to
human subjects and share aspects of both classes that are separated by that
decision boundary. Third, adversarial images constructed by models with small
power of the interaction vertex, which are equivalent to DNN with rectified
linear units (ReLU), fail to transfer to and fool the models with higher order
interactions. This opens up a possibility to use higher order models for
detecting and stopping malicious adversarial attacks. The presented results
suggest that DAM with higher order energy functions are closer to human visual
perception than DNN with ReLUs.
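As an illustration of the kind of energy function the abstract above describes, here is a minimal sketch. It assumes binary (+1/-1) neurons and a rectified polynomial interaction F(x) = max(x, 0)**n, where n plays the role of the "power of the interaction vertex". The function names, the toy patterns, and the update schedule are illustrative assumptions, not taken from the paper.

```python
def energy(state, memories, n):
    """Energy E = -sum_mu F(xi_mu . state), with F(x) = max(x, 0)**n (assumed form)."""
    total = 0
    for xi in memories:
        overlap = sum(a * b for a, b in zip(xi, state))
        total -= max(overlap, 0) ** n
    return total


def recall(state, memories, n, sweeps=3):
    """Asynchronous descent: flip a neuron only if that strictly lowers the energy."""
    state = list(state)
    for _ in range(sweeps):
        for i in range(len(state)):
            flipped = state[:i] + [-state[i]] + state[i + 1:]
            if energy(flipped, memories, n) < energy(state, memories, n):
                state = flipped
    return state


# Three mutually orthogonal toy patterns of 16 binary neurons (hypothetical data).
p0 = [1] * 16
p1 = [1] * 8 + [-1] * 8
p2 = [1, 1, 1, 1, -1, -1, -1, -1] * 2
memories = [p0, p1, p2]

# Corrupt p1 in two positions, then let the dynamics clean it up.
noisy = list(p1)
noisy[0], noisy[15] = -noisy[0], -noisy[15]
restored = recall(noisy, memories, n=3)
print(restored == p1)
```

Raising n makes the stored pattern with the largest overlap dominate the energy, which is the intuition behind the sharper, more pattern-like minima mentioned above.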
Unsupervised Learning by Competing Hidden Units
It is widely believed that the backpropagation algorithm is essential for
learning good feature detectors in early layers of artificial neural networks,
so that these detectors are useful for the task performed by the higher layers
of that neural network. At the same time, the traditional form of
backpropagation is biologically implausible. In the present paper we propose an
unusual learning rule, which has a degree of biological plausibility, and which
is motivated by Hebb's idea that the change in synapse strength should be local,
i.e., should depend only on the activities of the pre- and postsynaptic
neurons. We design a learning algorithm that utilizes global inhibition in the
hidden layer, and is capable of learning early feature detectors in a
completely unsupervised way. These learned lower layer feature detectors can be
used to train higher layer weights in a usual supervised way so that the
performance of the full network is comparable to the performance of standard
feedforward networks trained end-to-end with a backpropagation algorithm.
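To make the idea of local, unsupervised learning with competition between hidden units concrete, the sketch below uses classic winner-take-all competitive learning. This is a swapped-in textbook technique, not the learning rule proposed in the paper. Global inhibition is approximated by letting only the most active hidden unit update, and that update uses only the input and the unit's own weights. All names and numbers are hypothetical.

```python
def train_winner_take_all(inputs, weights, lr=0.5, epochs=20):
    """Move only the winning unit's weights toward each input (local update)."""
    weights = [list(w) for w in weights]
    for _ in range(epochs):
        for x in inputs:
            # Activation of each hidden unit is its dot product with the input.
            activations = [sum(wi * xi for wi, xi in zip(w, x)) for w in weights]
            # Hard competition: a stand-in for global inhibition.
            winner = activations.index(max(activations))
            weights[winner] = [wi + lr * (xi - wi)
                               for wi, xi in zip(weights[winner], x)]
    return weights


# Two toy input patterns and two hidden units with slightly biased initial weights.
inputs = [[1.0, 0.0], [0.0, 1.0]]
initial = [[0.6, 0.4], [0.4, 0.6]]
learned = train_winner_take_all(inputs, initial)
print(learned)
```

In this toy setting each hidden unit specializes on one input pattern without any labels or backpropagated error signal.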
Neuron-Astrocyte Associative Memory
Astrocytes, a unique type of glial cell, are thought to play a significant
role in memory due to their involvement in modulating synaptic plasticity.
Nonetheless, no existing theories explain how neurons, synapses, and astrocytes
could collectively contribute to memory function. To address this, we propose a
biophysical model of neuron-astrocyte interactions that unifies various
viewpoints on astrocyte function in a principled, biologically-grounded
framework. A key aspect of the model is that astrocytes mediate long-range
interactions between distant tripartite synapses. This effectively creates
"multi-neuron synapses" where more than two neurons interact at the same
synapse. Such multi-neuron synapses are ubiquitous in models of Dense
Associative Memory (also known as Modern Hopfield Networks) and are known to
lead to superlinear memory storage capacity, which is a desirable computational
feature. We establish a theoretical relationship between neuron-astrocyte
networks and Dense Associative Memories and demonstrate that neuron-astrocyte
networks have a larger memory storage capacity per compute unit compared to
previously published biological implementations of Dense Associative Memories.
This theoretical correspondence suggests the exciting hypothesis that memories
could be stored, at least partially, within astrocytes instead of in the
synaptic weights between neurons. Importantly, the many-neuron synapses can be
influenced by feedforward signals into the astrocytes, such as neuromodulators,
potentially originating from distant neurons.
Comment: 18 pages, 2 figure
Sparse Distributed Memory is a Continual Learner
Continual learning is a problem for artificial neural networks that their
biological counterparts are adept at solving. Building on work using Sparse
Distributed Memory (SDM) to connect a core neural circuit with the powerful
Transformer model, we create a modified Multi-Layered Perceptron (MLP) that is
a strong continual learner. We find that every component of our MLP variant
translated from biology is necessary for continual learning. Our solution is
also free from any memory replay or task information, and introduces novel
methods to train sparse networks that may be broadly applicable.
Comment: 9 Pages. ICLR Acceptanc
Long Sequence Hopfield Memory
Sequence memory is an essential attribute of natural and artificial
intelligence that enables agents to encode, store, and retrieve complex
sequences of stimuli and actions. Computational models of sequence memory have
been proposed where recurrent Hopfield-like neural networks are trained with
temporally asymmetric Hebbian rules. However, these networks suffer from
limited sequence capacity (maximal length of the stored sequence) due to
interference between the memories. Inspired by recent work on Dense Associative
Memories, we expand the sequence capacity of these models by introducing a
nonlinear interaction term, enhancing separation between the patterns. We
derive novel scaling laws for sequence capacity with respect to network size,
significantly outperforming existing scaling laws for models based on
traditional Hopfield networks, and verify these theoretical results with
numerical simulation. Moreover, we introduce a generalized pseudoinverse rule
to recall sequences of highly correlated patterns. Finally, we extend this
model to store sequences with variable timing between state transitions and
describe a biologically-plausible implementation, with connections to motor
neuroscience.
Comment: NeurIPS 2023 Camera-Ready, 41 page
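The combination of a temporally asymmetric Hebbian rule with a nonlinear interaction term can be sketched as follows. The sketch assumes binary (+1/-1) neurons, a polynomial nonlinearity applied to the overlap with each stored pattern, and a cyclic sequence. The function names, the odd default degree, and the wraparound are illustrative assumptions rather than the paper's exact formulation.

```python
def overlap(pattern, state):
    """Normalized overlap between a stored pattern and the current state."""
    return sum(a * b for a, b in zip(pattern, state)) / len(state)


def step(state, sequence, degree=3):
    """One synchronous update: each pattern votes for its successor,
    weighted by a nonlinear function of its overlap with the state."""
    field = [0.0] * len(state)
    for mu, pattern in enumerate(sequence):
        weight = overlap(pattern, state) ** degree  # odd degree keeps the sign
        successor = sequence[(mu + 1) % len(sequence)]  # cyclic sequence (assumed)
        for i, value in enumerate(successor):
            field[i] += weight * value
    return [1 if h >= 0 else -1 for h in field]


# A toy sequence of three mutually orthogonal 16-neuron patterns (hypothetical data).
p0 = [1] * 16
p1 = [1] * 8 + [-1] * 8
p2 = [1, 1, 1, 1, -1, -1, -1, -1] * 2
sequence = [p0, p1, p2]

# Start from a corrupted copy of p0 and advance one step along the sequence.
noisy = list(p0)
noisy[3] = -noisy[3]
print(step(noisy, sequence) == p1)
```

With degree=1 this reduces to the linear asymmetric Hebbian update. A higher degree suppresses the contribution of patterns that overlap only weakly with the current state, which is the separation effect described above.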
Morphogenesis at criticality
Spatial patterns in the early fruit fly embryo emerge from a network of interactions among transcription factors, the gap genes, driven by maternal inputs. Such networks can exhibit many qualitatively different behaviors, separated by critical surfaces. At criticality, we should observe strong correlations in the fluctuations of different genes around their mean expression levels, a slowing of the dynamics along some but not all directions in the space of possible expression levels, correlations of expression fluctuations over long distances in the embryo, and departures from a Gaussian distribution of these fluctuations. Analysis of recent experiments on the gap gene network shows that all these signatures are observed, and that the different signatures are related in ways predicted by theory. Although there might be other explanations for these individual phenomena, the confluence of evidence suggests that this genetic network is tuned to criticality.